Constant Approximation for k-Median and k-Means with Outliers via Iterative Rounding
Authors
Abstract
In this paper, we present a new iterative rounding framework for many clustering problems. Using this, we obtain an (α1 + ε ≤ 7.081 + ε)-approximation algorithm for k-median with outliers, greatly improving upon the large implicit constant approximation ratio of Chen [16]. For k-means with outliers, we give an (α2 + ε ≤ 53.002 + ε)-approximation, which is the first O(1)-approximation for this problem. The iterative algorithm framework is very versatile; we show how it can be used to give α1- and (α1 + ε)-approximation algorithms for matroid and knapsack median problems respectively, improving upon the previous best approximation ratios of 8 [42] and 17.46 [9]. The natural LP relaxation for the k-median/k-means with outliers problem has an unbounded integrality gap. In spite of this negative result, our iterative rounding framework shows that we can round an LP solution to an almost-integral solution of small cost, in which we have at most two fractionally open facilities. Thus, the LP integrality gap arises due to the gap between almost-integral and fully-integral solutions. Then, using a pre-processing procedure, we show how to convert an almost-integral solution to a fully-integral solution, losing only a constant factor in the approximation ratio. By further using a sparsification technique, the additive loss incurred by the conversion can be reduced to any ε > 0.
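The abstract does not restate the objective being approximated. As a minimal sketch, assuming the standard definition (choose a set of centers, discard up to m points as outliers, and sum the distances of the remaining points to their nearest center, with squared distances for k-means), the cost of a candidate solution could be evaluated as below. The function name `cost_with_outliers` and its parameters are hypothetical and are not taken from the paper; this only evaluates a given set of centers and is not the paper's iterative rounding algorithm.

```python
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")


def cost_with_outliers(
    points: Sequence[P],
    centers: Sequence[P],
    num_outliers: int,
    dist: Callable[[P, P], float],
    power: int = 1,
) -> float:
    """Cost of `centers` when the `num_outliers` most expensive points are dropped.

    power=1 corresponds to k-median with outliers, power=2 to k-means with
    outliers (assumed standard definitions, not quoted from the paper).
    """
    if not centers:
        raise ValueError("at least one center is required")
    if not 0 <= num_outliers <= len(points):
        raise ValueError("num_outliers must be between 0 and len(points)")

    # Connection cost of each point: distance to its nearest center, raised to `power`.
    per_point = sorted(min(dist(p, c) for c in centers) ** power for p in points)

    # For fixed centers, discarding the largest connection costs is optimal,
    # so keep only the cheapest len(points) - num_outliers points.
    keep = len(points) - num_outliers
    return sum(per_point[:keep])


if __name__ == "__main__":
    line_points = [0, 1, 2, 10, 100]
    line_dist = lambda a, b: abs(a - b)
    # With one outlier allowed, the far point at 100 is discarded.
    print(cost_with_outliers(line_points, [1, 10], 1, line_dist))
```

For a fixed set of centers the choice of outliers is easy (drop the most expensive points); the difficulty the paper addresses lies in choosing the centers themselves.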
Related resources
A New Approximation for the Null Distribution of the Likelihood Ratio Test Statistics for k Outliers in a Normal Sample
Usually when performing a statistical test or estimation procedure, we assume the data are all observations of i.i.d. random variables, often from a normal distribution. Sometimes, however, we notice in a sample one or more observations that stand out from the crowd. These observation(s) are commonly called outlier(s). Outlier tests are more formal procedures which have been developed for detec...
Constant-Factor Approximation for Ordered k-Median
We study the Ordered k-Median problem, in which the solution is evaluated by first sorting the client connection costs and then multiplying them with a predefined non-increasing weight vector (higher connection costs are taken with larger weights). Since the 1990s, this problem has been studied extensively in the discrete optimization and operations research communities and has emerged as a fra...
Approximation Algorithms for Aversion k-Clustering via Local k-Median
In the aversion k-clustering problem, given a metric space, we want to cluster the points into k clusters. The cost incurred by each point is the distance to the furthest point in its cluster, and the cost of the clustering is the sum of all these per-point-costs. This problem is motivated by questions in generating automatic abstractions of extensive-form games. We reduce this problem to a “lo...
Approximation Schemes for Clustering with Outliers
Clustering problems are well-studied in a variety of fields such as data science, operations research, and computer science. Such problems include variants of centre location problems, k-median, and k-means to name a few. In some cases, not all data points need to be clustered; some may be discarded for various reasons. For instance, some points may arise from noise in a data set or one might b...
Algorithms with Provable Guarantees for Clustering
In this talk, we give an overview of the current best approximation algorithms for fundamental clustering problems, such as k-center, k-median, k-means, and facility location. We focus on recent progress and point out several important open problems. For the uncapacitated versions, a variety of algorithmic methodologies, such as LP-rounding and primal-dual method, have been applied to a standar...
Journal title:
- CoRR
Volume: abs/1711.01323
Publication date: 2017